@klausler
Contributor

…tion

When scanning list-directed input for nulls and repetition counts, the current library depends on each record having been prescanned for asterisk characters. The overhead of calling memchr(...,'*',...) on each record doesn't pay off, especially on systems without SIMD-vectorized memchr implementations. Even on systems that have one, it's about 10% faster to just scan ahead for an asterisk when decimal digits are encountered. Only when an asterisk is present, which is uncommon, do we need to convert the digits to their integer value.
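For readers who want to see the shape of the idea, here is a minimal sketch of the "scan ahead past digits, convert only if an asterisk follows" approach. It is not the runtime's actual code: the function name `ScanRepeatCount`, its signature, and the use of `std::string_view` are all assumptions made for illustration, and overflow of the count is not handled.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helper (not the library's API). If the text at `pos` is a run
// of decimal digits immediately followed by '*', return the repeat count and
// advance `pos` past the asterisk. Otherwise return nullopt and leave `pos`
// unchanged so the caller can parse the digits as an ordinary value.
std::optional<unsigned long> ScanRepeatCount(
    std::string_view record, std::size_t &pos) {
  std::size_t p = pos;
  // Cheap pass: skip over the digits without converting them.
  while (p < record.size() && record[p] >= '0' && record[p] <= '9') {
    ++p;
  }
  // No digits, end of record, or no asterisk: not a repeat count.
  if (p == pos || p == record.size() || record[p] != '*') {
    return std::nullopt;
  }
  // Uncommon case: an asterisk follows, so now convert the digits.
  // Assumption: the count fits in unsigned long (no overflow check).
  unsigned long count = 0;
  for (std::size_t j = pos; j < p; ++j) {
    count = 10 * count + static_cast<unsigned long>(record[j] - '0');
  }
  pos = p + 1; // consume the digits and the '*'
  return count;
}
```

In this sketch, a field such as `3*1.5` yields a count of 3 with `pos` left at the start of `1.5`, while a plain `42` returns nothing and is left for the normal numeric parser. No per-record `memchr` is needed because the asterisk check happens only at positions where digits were already being examined.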

Contributor

@vzakhari vzakhari left a comment

Looks good. Thank you!

@klausler klausler merged commit ea5262f into llvm:main Sep 23, 2025
9 checks passed
@klausler klausler deleted the faster-int-input branch September 23, 2025 22:45
4 participants